RGBA -> RGB default background color vs padding color

Most image processors convert RGBA images to RGB by incorporating the alpha channel in some way. The Pixtral image processor uses the alpha channel as a weight to blend the RGB channels onto a white background, i.e. if you pass in an RGBA image, the “transparent” part ends up white.
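To make the blending concrete, here is a minimal, dependency-free sketch of alpha-weighting a single pixel onto a background color. The helper name `composite_on_background` is made up for illustration and this is not the Pixtral processor's actual code; it only shows the general formula `out = rgb * alpha + background * (1 - alpha)`.

```python
def composite_on_background(rgba, background=(255, 255, 255)):
    """Blend one RGBA pixel (0-255 per channel) over an opaque background.

    Hypothetical helper for illustration only, not a Transformers API.
    """
    r, g, b, a = rgba
    alpha = a / 255.0  # 0.0 = fully transparent, 1.0 = fully opaque
    return tuple(
        round(channel * alpha + bg * (1.0 - alpha))
        for channel, bg in zip((r, g, b), background)
    )


# A fully transparent pixel takes the background color, whatever its RGB is.
print(composite_on_background((255, 0, 0, 0)))    # (255, 255, 255)
# A fully opaque pixel keeps its own color.
print(composite_on_background((255, 0, 0, 255)))  # (255, 0, 0)
# A partially transparent pixel lands somewhere in between.
print(composite_on_background((255, 0, 0, 128)))
```

Passing `background=(0, 0, 0)` instead would turn the transparent areas black, which is the only thing that differs between a "white fill" and a "black fill" conversion.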

However, the Pixtral image processor also uses the default pad function from image_transforms.py, which pads with black pixels by default.

Doesn’t this mean you could end up with an image that has a white background and a black padded border? It doesn’t really seem right, but I want to know if it is intentional.
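The situation I’m describing can be sketched roughly like this. It is a toy model on a grid of RGB tuples, assuming padding is added on the right and bottom; `pad_right_bottom` is a hypothetical helper, not the pad function from image_transforms.py, and the real processor works on arrays rather than nested lists.

```python
WHITE = (255, 255, 255)
BLACK = (0, 0, 0)


def pad_right_bottom(pixels, target_height, target_width, fill=BLACK):
    """Pad a row-major grid of RGB tuples to the target size with one color.

    Hypothetical helper for illustration; the padding side is an assumption.
    """
    # Extend every existing row to the target width.
    padded = [list(row) + [fill] * (target_width - len(row)) for row in pixels]
    # Append whole rows of the fill color until the target height is reached.
    padded += [[fill] * target_width for _ in range(target_height - len(pixels))]
    return padded


# A 2x2 image whose transparent pixels were already flattened to white.
image = [[WHITE, WHITE], [WHITE, WHITE]]

# Default fill: the white image ends up with a black border.
mixed = pad_right_bottom(image, 3, 4)
print({color for row in mixed for color in row} == {WHITE, BLACK})  # True

# Padding with the same color as the background gives a uniform canvas.
uniform = pad_right_bottom(image, 3, 4, fill=WHITE)
print(all(color == WHITE for row in uniform for color in row))  # True
```

So unless the two fill colors are chosen to match, the final image mixes a white background with a black border.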

For comparison, the image resizing and padding functions that Hugging Face Transformers uses for other models, such as Qwen 2 VL, seem to turn the transparent areas black instead. The mismatch is probably not intentional. Or rather, it probably doesn’t carry much meaning: it’s likely that the developers are mainly concerned with getting consistent results.
(I think it’s probably been left as is because, with larger models, whether the background is black or white doesn’t seem to have a significant impact on the results…)

As for Pixtral, the white fill seems to have been there since the first commit, so I wonder if the model was trained with that in mind. Since there doesn’t seem to be any discussion about it, I think you’d have to ask the committer on GitHub for clarification.